Abstract

In this article we suggest a new approach for scale-free networks generation with 
an alternative source of the power-law degree distribution. It comes from matrix 
factorization methods and geographical threshold models that were recently 
proven to show good results in scale-free networks generation. 

We associate each node with a latent features vector distributed over a unit 
sphere and with a weight variable sampled from a Pareto distribution. We join 
two nodes by an edge if they are spatially close and/or have large weights. The 
network produced by this approach is scale-free and has a power-law degree 
distribution with an exponent of 2. In addition, we propose an extension of the 
model that allows us to generate directed networks with tunable power-law 
exponents. 

Keywords: scale-free networks; matrix factorization; threshold models 


1 Introduction 

Most social, biological, topological and technological networks display distinct 
non-trivial topological features demonstrating that connections between the nodes 
are neither regular nor random at the same time [1]. Such systems are called complex 
networks. One of the well-known and well-studied classes of complex networks is
scale-free networks, whose degree distribution $P(k)$ follows a power law $P(k) \sim k^{-\alpha}$,
where $\alpha$ is a parameter whose value is typically in the range $2 < \alpha < 3$. Many real
networks have been reported to be scale-free [2].

Generating scale-free networks is an important problem because they usually have
useful properties such as high clustering [3], robustness to random attacks [4] and
easily achievable synchronization [5]. Several models for producing scale-free networks
have been suggested; most of them are based on the preferential attachment
approach [1]. This approach forces existing nodes of higher degrees to gain edges
added to the network more rapidly in a “rich-get-richer” manner. This paper offers
a model with another explanation of the scale-free property.

Our approach is inspired by matrix factorization, a machine learning method
that has been successfully used for link prediction [6]. The main idea is to approximate a
network adjacency matrix by a product of matrices $V^T$ and $V$, where $V$ is the
matrix of nodes’ latent features vectors. To create a generative model of scale-free
networks we sample latent features V from some probabilistic distribution and try to 
generate a network adjacency matrix. Two nodes are connected by an edge if the dot 
product of their latent features exceeds some threshold. This threshold condition is 
influenced by the geographical threshold models that are applied to scale-free network 






Artikov et al. 




generation [7]. Because of the methods used (adjacency matrix factorization and 
threshold condition) we call our model the factorization threshold model. 

A network produced in such a way is scale-free and follows a power-law degree
distribution with an exponent of 2, which differs from the results for basic preferential
attachment models [8, 9, 10] where the exponent equals 3. We also suggest an
extension of our model that allows us to generate directed networks with a tunable
power-law exponent.

This paper is organized as follows. Section 2 provides information about related 
works that inspired us. The formal description of our model in the case of an 
undirected fixed size network is presented in Section 3, which is followed by a 
discussion of how to generate growing networks. In Section 4 the problem of making 
resulting networks sparse is considered. Section 5 shows that our model indeed 
produces scale-free networks. Extensions of our model, which allow us to generate
directed networks with a tunable power-law exponent and some other interesting
properties, are discussed in Section 6. Section 7 concludes the paper.

2 Related work 

In this section we consider related works that encouraged us to create a new model 
for complex networks generation. 

2.1 Matrix Factorization 

Matrix factorization is a group of algorithms where a given matrix $R$ is factorized
into two smaller matrices $Q$ and $P$ such that $R \approx Q^T P$ [11].

There is a popular approach in recommendation systems which is based on matrix
factorization [12]. Assume that users express their preferences by rating some items;
this can be viewed as an approximate representation of their interests. Combining
the known ratings, we get a partially filled matrix $R$; the idea is to approximate the unknown
ratings using the matrix factorization $R \approx Q^T P$. A geometrical interpretation is the
following. The rows of the matrices $Q$ and $P$ can be seen as latent features vectors
$q_i$ and $p_u$ of items and users, respectively. The dot product $\langle q_i, p_u \rangle$ captures the
interaction between a user $u$ and an item $i$, and it should approximate the rating
of the item $i$ by the user $u$: $R_{ui} \approx \langle q_i, p_u \rangle$. The mapping of each user and item to latent
features is treated as an optimization problem of minimizing the distance between
$R$ and $Q^T P$, which is usually solved using SGD (stochastic gradient descent) or ALS
(alternating least squares) methods.
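
As an illustration of the SGD variant mentioned above, the following is a minimal sketch in plain Python. The function names `factorize_sgd` and `rmse`, the toy ratings, and the hyperparameters (rank, learning rate, regularization, number of epochs) are assumptions made for this example only; they do not come from the cited works.

```python
import random


def factorize_sgd(ratings, n_users, n_items, rank=2, lr=0.02, reg=0.01,
                  epochs=500, seed=0):
    """Fit ratings[(u, i)] ~ <q_i, p_u> with stochastic gradient descent.

    All hyperparameters are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    # Small random initial latent features for users (p) and items (q).
    p = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_users)]
    q = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_items)]
    observed = list(ratings.items())
    for _ in range(epochs):
        rng.shuffle(observed)
        for (u, i), r in observed:
            err = r - sum(a * b for a, b in zip(p[u], q[i]))
            for f in range(rank):
                pu, qi = p[u][f], q[i][f]
                # Gradient step on the squared error with L2 regularization.
                p[u][f] += lr * (err * qi - reg * pu)
                q[i][f] += lr * (err * pu - reg * qi)
    return p, q


def rmse(ratings, p, q):
    """Root mean squared error on the observed ratings only."""
    se = sum((r - sum(a * b for a, b in zip(p[u], q[i]))) ** 2
             for (u, i), r in ratings.items())
    return (se / len(ratings)) ** 0.5


# Toy partially filled rating matrix: keys are (user, item) pairs.
toy_ratings = {(0, 0): 5.0, (0, 1): 3.0, (1, 0): 4.0,
               (1, 2): 1.0, (2, 1): 1.0, (2, 2): 5.0}
p, q = factorize_sgd(toy_ratings, n_users=3, n_items=3)
print("training RMSE:", rmse(toy_ratings, p, q))
```

Only the observed entries enter the loss, which is what lets the learned dot products serve as predictions for the unknown ratings.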

Furthermore, matrix factorization was suggested to be used for link prediction in
networks [6]. Link prediction refers to the problem of finding missing or hidden links
which probably exist in a network [13]. In [6] it is solved via matrix factorization: a
network adjacency matrix $A$ is approximated by a product of the matrices $V^T$ and
$V$, where $V$ is the matrix of nodes’ latent features.

2.2 Geographical threshold models 

Geographical threshold models were recently proven to have good results in scale- 
free networks generation [7]. We are going to briefly summarize one variation of 
these models [14]. 

Suppose the number of nodes to be fixed. Each node carries a randomly and
independently distributed weight variable $w_i \in \mathbb{R}$. Also, the nodes are uniformly
and independently distributed with specified density in $\mathbb{R}^d$. A pair of nodes with
weights $w, w'$ and Euclidean distance $r$ are connected if and only if:

$$ (w + w') \cdot h(r) \ge \theta, \tag{1} $$

where $\theta$ is the model threshold parameter and $h(r)$ is a distance function that is
assumed to decrease in $r$. For example, we can take $h(r) = r^{-\beta}$, where $\beta > 0$.
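
The connection rule (1) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: nodes are placed in the unit square (d = 2), weights are exponential with rate `rate`, and h(r) = r^(-beta); the function name `geographical_threshold_graph` and all default values are hypothetical.

```python
import math
import random


def geographical_threshold_graph(n, theta, beta=2.0, rate=1.0, seed=0):
    """Sample a geographical threshold graph in the unit square.

    Two nodes are joined when (w + w') * r**(-beta) >= theta, as in rule (1).
    The square domain and the exponential weights are assumptions of this sketch.
    """
    rng = random.Random(seed)
    weights = [rng.expovariate(rate) for _ in range(n)]
    points = [(rng.random(), rng.random()) for _ in range(n)]
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            r = math.dist(points[i], points[j])
            # Coincident points (r == 0) are treated as connected.
            if r == 0.0 or (weights[i] + weights[j]) * r ** (-beta) >= theta:
                edges.append((i, j))
    return weights, points, edges


_, _, dense_edges = geographical_threshold_graph(100, theta=1.0)
_, _, sparse_edges = geographical_threshold_graph(100, theta=1000.0)
print(len(dense_edges), len(sparse_edges))
```

With the same seed the node weights and positions are identical, so raising the threshold can only remove edges.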

First, an exponential distribution of weights with the inverse scale parameter $\lambda$ has
been studied. This distribution of weights leads to scale-free networks with a power-law
exponent of 2: $P(k) \propto k^{-2}$. It is interesting that the exponent of the power law
does not depend on $\lambda$, $d$ and $\beta$ in this case. Second, a Pareto weight distribution
with scale parameter $w_0$ and shape parameter $\alpha$ has been considered. In this case
a tunable power-law degree distribution has been achieved: $P(k) \propto k^{-1-\alpha\beta/d}$.

There are other variations of this approach: uniform distribution of coordinates in
the $d$-dimensional unit cube [15], lattice-based models [16], [17] and even networks
embedded in fractal space [18].

3 Model description 

We studied matrix factorization theoretically by turning it from a trainable
supervised model into a generative probabilistic model. When matrix factorization is
used in machine learning, the adjacency matrix $A$ is given and the goal is to train the
model by tuning the matrix of latent features $V$ in such a way that $A \approx V^T V$. In our
model we do the reverse: latent features $V$ are sampled from some probability
distribution and we generate a network adjacency matrix $A$ based on $V^T V$.

Formally, our model is described in the following way:

$$
\begin{cases}
A_{ij} = I[\langle v_i, v_j \rangle \ge \theta], \\
v_i = w_i x_i \in \mathbb{R}^d, \\
w_i \sim \mathrm{Pareto}(\alpha, w_0), \quad x_i \sim \mathrm{Uniform}(S^{d-1}), \\
i = 1, \dots, n, \quad j = 1, \dots, n.
\end{cases}
$$

• The network has $n$ nodes and each node is associated with a $d$-dimensional latent
features vector $v_i$.

• Each latent features vector $v_i$ is a product of a weight $w_i$ and a direction $x_i$.

• Directions $x_i$ are i.i.d. random vectors uniformly distributed over the surface
of the $(d-1)$-sphere.

• Weights $w_i$ are i.i.d. random variables distributed according to the Pareto
distribution with the following density function $f(w)$:

$$ f(w) = \frac{\alpha}{w_0}\left(\frac{w_0}{w}\right)^{\alpha+1} \quad (w \ge w_0). \tag{2} $$

• Edges between nodes $i$ and $j$ appear if the dot product of their latent features
vectors $\langle v_i, v_j \rangle$ exceeds a threshold parameter $\theta$.

Therefore, we take into consideration both a node’s importance $w_i$ and its location
$x_i$ on the surface of a $(d-1)$-sphere (which can be interpreted as the Earth in the
case of $x_i \in S^2 \subset \mathbb{R}^3$). Thus, inspired by the matrix factorization approach, we
achieved the following model behavior: the edges in our model are assumed to be
formed when a pair of nodes is spatially close and/or has large weights. Compared
with the geographical threshold models, we use the dot product to measure the
proximity of nodes instead of the Euclidean distance.
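
The model definition can be turned into a short generator. The sketch below is a plain-Python illustration for a fixed-size network; the function name `factorization_threshold_graph` and the chosen parameter values are assumptions of this example. Directions are drawn by normalizing Gaussian vectors, a standard way to sample uniformly on a sphere, and weights use the standard-library Pareto sampler rescaled by $w_0$.

```python
import math
import random


def factorization_threshold_graph(n, alpha, w0, theta, d=3, seed=0):
    """Sample weights, directions and edges of a fixed-size network.

    An undirected edge (i, j) is created when w_i * w_j * <x_i, x_j> >= theta.
    """
    rng = random.Random(seed)
    # Pareto(alpha, w0): paretovariate has scale 1, so rescale by w0.
    weights = [w0 * rng.paretovariate(alpha) for _ in range(n)]
    directions = []
    for _ in range(n):
        # A normalized Gaussian vector is uniform on the (d-1)-sphere.
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(c * c for c in v))
        directions.append([c / norm for c in v])
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(directions[i], directions[j]))
            if weights[i] * weights[j] * dot >= theta:
                edges.append((i, j))
    return weights, directions, edges


weights, directions, edges = factorization_threshold_graph(
    n=150, alpha=2.0, w0=1.0, theta=5.0, seed=1)
print("number of edges:", len(edges))
```

The quadratic loop over node pairs is fine for small networks; it is kept here for clarity rather than speed.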

We have defined our model for fixed-size networks, but in principle it
can be generalized to the case of growing networks. The problem is that, when the
size of a network tends to infinity, a fixed threshold $\theta$ leads with high probability
to a complete graph, whereas real networks are usually sparse.

Therefore, in order to introduce growing factorization threshold models we use
a threshold function $\theta := \theta(n)$ which depends on the number of nodes $n$ in the
network. Then for every value of the network size $n$ we have the same parameters
except for the threshold $\theta$. This means that at every step, when a new node is
added to the graph, some of the existing edges are removed. In the next section
we will try to find threshold functions which lead to sparse networks.

In order to preserve readability of the proofs we consider only the case d = 3 
because proofs for higher dimensions can be derived in a similar way. However, we 
will give not only mean-field approximations but also strict probabilistic proofs, 
which to the best of our knowledge have not been done for geographical threshold 
models yet and can be likely applied in the other works too. 

4 Generating sparse networks 

The aim of this section is to model sparse growing networks. To do this we need 
to find a proper threshold function. 

First, we have studied the growth of real networks. For example, Figure 1
shows the growth of a citation graph. The data was obtained from the SNAP¹
database. It can be seen that the function $y(x) = 4.95\,x \log x - 40x$ is a good
estimation of the growth rate of this network. That is why we decided to focus on the
linearithmic or sub-linearithmic growth rate of the model (here and subsequently,
by the growth of the model we mean the growth of the number of edges).

4.1 Analysis of the expected number of edges 

Let M(n) denote the number of edges in the network of size n. To find its expectation 
we need the two following lemmas. 

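Lemma 4.1 The probability for a node with weight $w$ to be connected to a random
node is

$$
P_e(w) =
\begin{cases}
\dfrac{w_0^{\alpha} w^{\alpha}}{2\theta^{\alpha}(\alpha+1)}, & w < \theta/w_0, \\[2mm]
\dfrac{1}{2}\left(1 - \dfrac{\alpha\theta}{w(\alpha+1)w_0}\right), & w \ge \theta/w_0.
\end{cases}
\tag{3}
$$

¹ https://snap.stanford.edu/data/

Figure 1: The growth of citation graph Arxiv HEP-PH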


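Lemma 4.2 The edge probability in the network is

$$
P_e =
\begin{cases}
\dfrac{1}{2} - \dfrac{1}{2}\,\dfrac{\alpha^2}{(\alpha+1)^2}\,\dfrac{\theta}{w_0^2}, & \theta < w_0^2, \\[2mm]
\dfrac{w_0^{2\alpha}}{2\theta^{\alpha}}\left(\dfrac{\alpha(\ln\theta - 2\ln w_0)}{\alpha+1} + \dfrac{2\alpha+1}{(\alpha+1)^2}\right), & \theta \ge w_0^2.
\end{cases}
\tag{4}
$$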


To improve readability, we moved the proofs of Lemma 4.1 and Lemma 4.2 to the
Appendix.
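
The closed forms of Lemma 4.1 and Lemma 4.2 can be checked numerically. The sketch below is an illustration only: it encodes the formulas as they follow from the appendix derivation for $d = 3$ and compares the edge probability with a Monte Carlo estimate. The function names are hypothetical, and the simulation relies on the fact that the dot product of two independent uniform directions on the 2-sphere is uniformly distributed on [-1, 1].

```python
import math
import random


def edge_prob_given_weight(w, alpha, w0, theta):
    """P_e(w) for d = 3, as read from Lemma 4.1."""
    if w < theta / w0:
        return (w0 * w) ** alpha / (2.0 * (alpha + 1.0) * theta ** alpha)
    return 0.5 * (1.0 - alpha * theta / ((alpha + 1.0) * w0 * w))


def edge_prob(alpha, w0, theta):
    """P_e for d = 3, as read from Lemma 4.2."""
    if theta < w0 ** 2:
        return 0.5 - alpha ** 2 * theta / (2.0 * (alpha + 1.0) ** 2 * w0 ** 2)
    scale = w0 ** (2.0 * alpha) / (2.0 * theta ** alpha)
    log_term = alpha * (math.log(theta) - 2.0 * math.log(w0)) / (alpha + 1.0)
    return scale * (log_term + (2.0 * alpha + 1.0) / (alpha + 1.0) ** 2)


def monte_carlo_edge_prob(alpha, w0, theta, trials=200000, seed=0):
    """Estimate P_e by sampling two weights and the cosine between directions."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        w1 = w0 * rng.paretovariate(alpha)
        w2 = w0 * rng.paretovariate(alpha)
        cosine = 2.0 * rng.random() - 1.0  # uniform on [-1, 1) when d = 3
        if w1 * w2 * cosine >= theta:
            hits += 1
    return hits / trials


print(edge_prob(2.0, 1.0, 4.0), monte_carlo_edge_prob(2.0, 1.0, 4.0))
```

Both branches of each formula agree at the switching points $w = \theta/w_0$ and $\theta = w_0^2$, which is a quick sanity check on the reconstruction.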

The next theorem shows that our model can have any growth which is slower than
quadratic.


Theorem 4.3 Let $R(n)$ be a function such that $R(n) = o(n^2)$ and $R(n) > 0$.
Then there exists a threshold function $\theta(n)$ such that the growth of the model is $R(n)$:

$$ \exists N: \quad \mathbb{E}M(n) = R(n) \quad (n \ge N). $$

Proof It is easy to check that $P_e$ is a continuous function of $\theta$. By the intermediate
value theorem, $P_e(\theta)$ takes every value between $P_e(\theta = 0) = 1/2$ and
$P_e(\theta = \infty) = 0$ at some point within the interval.

Since $R(n) = o(n^2)$ and $R(n)$ is positive, there exists $N$ such that for all $n \ge N$,

$$ 0 < R(n) < \frac{1}{4}\, n(n-1). $$

Because $\mathbb{E}M(n) = \frac{n(n-1)}{2} P_e$, it means that the equation $\mathbb{E}M(n) = R(n)$ is feasible for all $n \ge N$. □
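
The existence argument of Theorem 4.3 is constructive in practice: since the edge probability is continuous and decreasing in the threshold, a threshold matching a target expected number of edges can be found by bisection. The following sketch assumes the closed form of Lemma 4.2 for $d = 3$; the function names and the example target $R(n) = n \ln n$ are illustrative assumptions.

```python
import math


def edge_prob(alpha, w0, theta):
    """P_e for d = 3, as read from Lemma 4.2."""
    if theta < w0 ** 2:
        return 0.5 - alpha ** 2 * theta / (2.0 * (alpha + 1.0) ** 2 * w0 ** 2)
    scale = w0 ** (2.0 * alpha) / (2.0 * theta ** alpha)
    log_term = alpha * (math.log(theta) - 2.0 * math.log(w0)) / (alpha + 1.0)
    return scale * (log_term + (2.0 * alpha + 1.0) / (alpha + 1.0) ** 2)


def threshold_for_expected_edges(n, target_edges, alpha, w0):
    """Find theta such that n(n-1)/2 * P_e(theta) equals target_edges."""
    pairs = n * (n - 1) / 2.0
    p_target = target_edges / pairs
    if not 0.0 < p_target < 0.5:
        raise ValueError("target must satisfy 0 < R(n) < n(n-1)/4")
    lo, hi = 0.0, 1.0
    # Grow the bracket until P_e(hi) drops below the target probability.
    while edge_prob(alpha, w0, hi) > p_target:
        hi *= 2.0
    # Bisection: P_e is decreasing in theta.
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if edge_prob(alpha, w0, mid) > p_target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


n = 1000
theta = threshold_for_expected_edges(n, n * math.log(n), alpha=2.0, w0=1.0)
print("theta(n) for n ln n expected edges:", theta)
```

Repeating the search for each network size gives a numerical threshold function $\theta(n)$ for any admissible growth rate.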


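Taking into account Theorem 4.3, we obtain parameters for the linearithmic and
linear growths of the expected number of edges.

Theorem 4.4 Suppose the following threshold function: $\theta(n) = D n^{1/\alpha}$, where $D$ is
a constant. Then the growth of the model is linearithmic:

$$ \mathbb{E}M(n) = A\, n \ln n\, (1 + o(1)) \quad \left(n \ge \frac{w_0^{2\alpha}}{D^{\alpha}}\right), $$

where $A$ is a constant depending on the Pareto distribution parameters.

Proof We can rewrite the inequality $n \ge w_0^{2\alpha}/D^{\alpha}$ as $D n^{1/\alpha} \ge w_0^2$ and apply Lemma 4.2 in
the case $\theta(n) = D n^{1/\alpha} \ge w_0^2$:

$$ \mathbb{E}M(n) = \frac{n(n-1)}{2} \cdot \frac{w_0^{2\alpha}}{2\theta^{\alpha}} \left(\frac{\alpha(\ln\theta - 2\ln w_0)}{\alpha+1} + \frac{2\alpha+1}{(\alpha+1)^2}\right). $$

If we replace $\theta$ by $D n^{1/\alpha}$, we obtain

$$
\begin{aligned}
\mathbb{E}M(n) &= \frac{n(n-1)\, w_0^{2\alpha}}{4 (D n^{1/\alpha})^{\alpha}} \left(\frac{\alpha(\ln(D n^{1/\alpha}) - 2\ln w_0)}{\alpha+1} + \frac{2\alpha+1}{(\alpha+1)^2}\right) \\
&= \frac{(n-1)\, w_0^{2\alpha}}{4 D^{\alpha}} \left(\frac{\ln n}{\alpha+1} + \frac{2\alpha+1}{(\alpha+1)^2} + \frac{\alpha(\ln D - 2\ln w_0)}{\alpha+1}\right) \\
&= A\, n \ln n\, (1 + o(1)).
\end{aligned}
$$

□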


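Theorem 4.5 Suppose that the growth of the model is sub-linearithmic: $\frac{\mathbb{E}M(n)}{n \ln n} = o(1)$.
Then $\frac{n^{1/\alpha}}{\theta(n)} = o(1)$.

Proof Let us consider another model with a threshold function $\theta'(n) = D n^{1/\alpha}$ and
the expected number of edges $\mathbb{E}M'(n)$. According to Theorem 4.4 and the condition
$\frac{\mathbb{E}M(n)}{n \ln n} = o(1)$, there exists a natural number $N_D$ such that

$$ \forall n \ge N_D: \quad \mathbb{E}M'(n) = A\, n \ln n\, (1 + o(1)) \ge \mathbb{E}M(n). $$

This also means that for all $n \ge N_D$ we have $\theta(n) \ge \theta'(n)$. Therefore

$$ \forall n \ge N_D: \quad \frac{n^{1/\alpha}}{\theta(n)} \le \frac{n^{1/\alpha}}{\theta'(n)} = \frac{1}{D}. $$

By the arbitrariness of the choice of $D$, we have $\frac{n^{1/\alpha}}{\theta(n)} = o(1)$.

□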


4.2 Concentration theorem 

In this section we find the variance of the number of edges and prove the
concentration theorem.

Proofs of the following lemmas can be found in the appendix. 


















Lemma 4.6 Suppose that $x$, $y$ and $z$ are random nodes. Let $P_<$ be the probability
for the node $x$ to be connected to both nodes $y$ and $z$. Then the variance of the
number of edges $M$ is

$$ \mathrm{Var}(M) = \frac{n(n-1)}{2} P_e (1 - P_e) + n(n-1)(n-2)\,(P_< - P_e^2). $$


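Lemma 4.7 Suppose that $x$, $y$ and $z$ are random nodes. Let $P_<$ be the probability
for the node $x$ to be connected to both nodes $y$ and $z$.

Then

$$
P_< =
\begin{cases}
\dfrac{1}{4}\,\dfrac{w_0^{3\alpha}}{\theta^{2\alpha}(\alpha+1)^2}\left[\left(\dfrac{\theta}{w_0}\right)^{\alpha} - w_0^{\alpha}\right]
+ \dfrac{1}{4}\,\dfrac{w_0^{2\alpha}}{\theta^{\alpha}}\left[1 - \dfrac{2\alpha^2}{(\alpha+1)^2} + \dfrac{\alpha^3}{(\alpha+1)^2(\alpha+2)}\right], & \theta \ge w_0^2, \\[3mm]
\dfrac{1}{4} - \dfrac{1}{2}\,\dfrac{\alpha^2}{(\alpha+1)^2}\,\dfrac{\theta}{w_0^2} + \dfrac{1}{4}\,\dfrac{\alpha^3}{(\alpha+1)^2(\alpha+2)}\,\dfrac{\theta^2}{w_0^4}, & \theta < w_0^2.
\end{cases}
$$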


Combining these results, we get the following theorem, which will be needed to
prove the concentration theorem.

Theorem 4.8 If $\theta \ge w_0^2$, the variance is

$$ \mathrm{Var}(M) = \mathbb{E}M + n(n-1)(n-2)\left[\frac{1}{\theta^{\alpha}} A + \frac{1}{\theta^{2\alpha}} B\right] - \frac{2(2n-3)}{n(n-1)}\,(\mathbb{E}M)^2, $$

where $A$ and $B$ are constants which depend on the Pareto distribution parameters.
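Proof According to Lemma 4.6 and Lemma 4.7, in the case $\theta \ge w_0^2$ the variance is

$$ \mathrm{Var}(M) = \frac{n(n-1)}{2} P_e (1 - P_e) + n(n-1)(n-2)\,(P_< - P_e^2), \tag{6} $$

$$ P_< = \frac{1}{4}\,\frac{w_0^{3\alpha}}{\theta^{2\alpha}(\alpha+1)^2}\left[\left(\frac{\theta}{w_0}\right)^{\alpha} - w_0^{\alpha}\right] + \frac{1}{4}\,\frac{w_0^{2\alpha}}{\theta^{\alpha}}\left[1 - \frac{2\alpha^2}{(\alpha+1)^2} + \frac{\alpha^3}{(\alpha+1)^2(\alpha+2)}\right]. \tag{7} $$

According to Lemma 4.2, the expected number of edges is

$$ \mathbb{E}M = \frac{n(n-1)}{2} P_e. \tag{8} $$

Combining (8) and (6), we obtain

$$ \mathrm{Var}(M) = \mathbb{E}M (1 - P_e) + n(n-1)(n-2) P_< - 2\,\mathbb{E}M\,(n-2) P_e = \mathbb{E}M + n(n-1)(n-2) P_< - \frac{2(2n-3)}{n(n-1)}\,(\mathbb{E}M)^2. $$

Moreover, (7) can be rewritten as

$$ P_< = \frac{1}{\theta^{\alpha}} C_1 + \frac{1}{\theta^{2\alpha}} C_2 + \frac{1}{\theta^{\alpha}} C_3 = \frac{1}{\theta^{\alpha}} A + \frac{1}{\theta^{2\alpha}} B, $$

where $C_1$, $C_2$, $C_3$, $A$ and $B$ are constants depending on the Pareto distribution
parameters.

Finally, we obtain

$$ \mathrm{Var}(M) = \mathbb{E}M + n(n-1)(n-2)\left[\frac{1}{\theta^{\alpha}} A + \frac{1}{\theta^{2\alpha}} B\right] - \frac{2(2n-3)}{n(n-1)}\,(\mathbb{E}M)^2. $$

□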


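Theorem 4.9 (Concentration theorem)

If $\theta(n)$ and $\mathbb{E}M(n)$ tend to infinity as $n \to \infty$ and $\frac{n^3}{(\mathbb{E}M(n))^2\, \theta(n)^{\alpha}} = o(1)$, then

$$ \forall \varepsilon > 0 \quad P(|M - \mathbb{E}M| > \varepsilon \cdot \mathbb{E}M) \to 0, $$

where $M$ is the number of edges in the graph.

Proof According to Chebyshev’s inequality, we have

$$ P(|M - \mathbb{E}M| > \varepsilon \cdot \mathbb{E}M) \le \frac{\mathrm{Var}(M)}{\varepsilon^2 (\mathbb{E}M)^2}. $$

Let us estimate the right-hand side of the inequality. Using Theorem 4.8, we get

$$
\begin{aligned}
\frac{\mathrm{Var}(M)}{\varepsilon^2 (\mathbb{E}M)^2}
&= \frac{1}{\varepsilon^2\, \mathbb{E}M} + \frac{O(n^3)}{\varepsilon^2 (\mathbb{E}M)^2}\left[\frac{A}{\theta^{\alpha}} + \frac{B}{\theta^{2\alpha}}\right] + O\!\left(\frac{1}{n}\right) \\
&= \frac{1}{\varepsilon^2\, \mathbb{E}M} + \frac{O(n^3)}{\varepsilon^2 (\mathbb{E}M)^2\, \theta^{\alpha}}\left[A + \frac{B}{\theta^{\alpha}}\right] + O\!\left(\frac{1}{n}\right).
\end{aligned}
\tag{9}
$$

Using the conditions of the theorem, we obtain

$$ \frac{\mathrm{Var}(M)}{\varepsilon^2 (\mathbb{E}M)^2} \to 0 \quad \text{as } n \to \infty. $$

□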


Combining Theorems 4.4, 4.5 and 4.9 we obtain the following corollary. 


Corollary 4.10 Suppose that one of the following conditions holds: 

• the threshold function $\theta(n)$ equals $D n^{1/\alpha}$

• El%i) = 0(1) and = o(l) 

Then 


$$ \forall \varepsilon > 0 \quad P(|M - \mathbb{E}M| > \varepsilon \cdot \mathbb{E}M) \to 0, $$

where $M$ is the number of edges in the graph.


In this way we have proved that the number of edges in the graph does not deviate
much from its expected value. It means that, having linearithmic or
sub-linearithmic growth of the expected number of edges, we also have the same growth
for the actual number of edges.

5 Degree distribution

In this section we show that our model follows a power-law degree distribution with
an exponent of 2 and give two proofs. The first is a mean-field approximation,
which is usually applied for fast checking of hypotheses. The second one is a strict
probabilistic proof. To the best of our knowledge, it has not been considered in the
context of the geographical threshold models yet.

To confirm our proofs we carried out a computer simulation and plotted the
complementary cumulative distribution of node degree, which is shown in Figure 2. We also
used a discrete power-law fitting method, which is described in [2] and implemented
in the network analysis package igraph². We obtained $\alpha = 2.16$, $x_{\min} = 4$ and a
quite large p-value of 0.9984 for the Kolmogorov-Smirnov goodness-of-fit test.
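
The two quantities used in this empirical check can be sketched without any external package. The code below is an assumption-laden illustration, not the igraph routine: `ccdf` computes the empirical complementary cumulative distribution $P(K \ge k)$, and `discrete_power_law_exponent` uses a commonly used closed-form approximation to the discrete maximum-likelihood exponent, $\hat\alpha \approx 1 + m \left[\sum_i \ln \frac{k_i}{k_{\min} - 1/2}\right]^{-1}$ over the $m$ degrees with $k_i \ge k_{\min}$. The synthetic sampler exists only to exercise the estimator.

```python
import math
import random
from collections import Counter


def ccdf(degrees):
    """Empirical complementary cumulative distribution P(K >= k)."""
    n = len(degrees)
    counts = Counter(degrees)
    result = {}
    remaining = n
    for k in sorted(counts):
        result[k] = remaining / n
        remaining -= counts[k]
    return result


def discrete_power_law_exponent(degrees, k_min):
    """Approximate maximum-likelihood exponent for the tail k >= k_min."""
    tail = [k for k in degrees if k >= k_min]
    log_sum = sum(math.log(k / (k_min - 0.5)) for k in tail)
    return 1.0 + len(tail) / log_sum


def sample_discrete_power_law(n, exponent, k_min, seed=0):
    """Synthetic test data: a rounded continuous power law (illustrative only)."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        y = (k_min - 0.5) * (1.0 - rng.random()) ** (-1.0 / (exponent - 1.0))
        sample.append(int(math.floor(y + 0.5)))
    return sample


synthetic = sample_discrete_power_law(20000, exponent=2.0, k_min=5)
print(discrete_power_law_exponent(synthetic, k_min=5))
print(ccdf([1, 2, 2, 3]))
```

Plotting the values of `ccdf` on log-log axes gives the kind of picture referred to as Figure 2; for a degree distribution with exponent 2 the complementary cumulative distribution has slope close to -1.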


Theorem 5.1 Let $P(k)$ be the probability of a random node to have degree $k$.
If $\frac{n^{1/\alpha}}{\theta(n)} = o(1)$, then there exist constants $C_0$ and $N_0$ such that for every $k(n)$
satisfying $k(n) < C_0 n$ for all $n > N_0$ we have

$$ P(k) = (1 + o(1))\, k^{-2}. $$

² http://igraph.org/
Mean-field approximation This approximation gives a power law only for nodes with
weights $w < \theta/w_0$. But the expected number $E_m$ of nodes with weights not satisfying this
inequality is extremely small:

$$ E_m = n\, P\!\left(w \ge \frac{\theta}{w_0}\right) = n \left(\frac{w_0^2}{\theta}\right)^{\alpha} = o(1). \tag{10} $$

As was shown in Lemma 4.1, the probability that the node $v_i = w_i x_i$ with weight
$w_i = w < \theta/w_0$ has an edge to another random node is

$$ P_e(w) = \frac{w_0^{\alpha}}{2\theta^{\alpha}(\alpha+1)}\, w^{\alpha}. $$

Let $k_i(w)$ be the degree of the node $v_i$. Then

$$ k_i(w) = \sum_{j \ne i} I[v_i \text{ is connected to } v_j], $$

where $I$ stands for the indicator function.

As all nodes are independent, we get

$$ \mathbb{E} k_i(w) = (n-1) P_e(w). $$

In the mean-field approximation we assume that $k_i(w)$ is really close to its
expectation and we can substitute it by $(n-1) P_e(w)$ in the following expression for the
degree distribution: $P(k) = f(w) \frac{dw}{dk}$, where $f(w)$ is the density of weights. Thus,

$$ P(k) = \frac{(n-1)\, w_0^{2\alpha}}{2\theta^{\alpha}(\alpha+1)}\, k^{-2} \propto k^{-2}. $$

□


Note that we have not used conditions on k{n) and 9{n) yet, they are needed to 
estimate residual terms in the following rigorous proof. 

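Proof The degree $k_i$ of the node $v_i$ is a binomial random variable. Using the probability
$P_e(w)$ of the node $v_i$ with weight $w_i = w$ to have an edge to another random node,
we can get the probability that $k_i$ equals $k$:

$$ P(k_i = k \mid w_i = w) = \binom{n-1}{k} (P_e(w))^k (1 - P_e(w))^{n-k-1}. $$

To get the total probability we need to integrate this expression with respect to $w$:

$$ P(k_i = k) = \binom{n-1}{k} \int_{w_0}^{\infty} (P_e(w))^k (1 - P_e(w))^{n-k-1}\, \frac{\alpha w_0^{\alpha}}{w^{\alpha+1}}\, dw. $$

Because $P_e(w)$ is a piecewise function, the integral breaks up into two parts:

$$ I_1 = \int_{w_0}^{\theta/w_0} (P_e(w))^k (1 - P_e(w))^{n-k-1}\, \frac{\alpha w_0^{\alpha}}{w^{\alpha+1}}\, dw, $$

$$ I_2 = \int_{\theta/w_0}^{\infty} (P_e(w))^k (1 - P_e(w))^{n-k-1}\, \frac{\alpha w_0^{\alpha}}{w^{\alpha+1}}\, dw. $$

Thus,

$$ P(k_i = k) = \binom{n-1}{k} (I_1 + I_2). $$

For estimating $I_1$ we can use the formula $P_e(w) = \frac{w_0^{\alpha} w^{\alpha}}{2\theta^{\alpha}(\alpha+1)}$ from Lemma 4.1.

After making the substitution to integrate with respect to $P_e(w)$ and using the
incomplete beta-function, we get

$$ I_1 = \frac{w_0^{2\alpha}}{2\theta^{\alpha}(\alpha+1)} \left[ B\!\left(\frac{1}{2(\alpha+1)};\, k-1,\, n-k\right) - B\!\left(\frac{w_0^{2\alpha}}{2\theta^{\alpha}(\alpha+1)};\, k-1,\, n-k\right) \right]. $$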

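For $I_2$ we can derive an upper bound. Note that for $w \ge \theta/w_0$ we have

$$ P_e(w) = \frac{1}{2}\left(1 - \frac{\alpha\theta}{w(\alpha+1)w_0}\right) \le \frac{1}{2}, $$

$$ 1 - P_e(w) \le 1 - P_e(\theta/w_0) = \frac{1}{2} + \frac{\alpha}{2(\alpha+1)} = \varepsilon_0 < 1. $$

Therefore we obtain the following upper estimate:

$$ I_2 = O\!\left( (\varepsilon_0)^{n-k-1}\, \frac{\alpha w_0^{\alpha}}{2^k} \int_{\theta/w_0}^{\infty} \frac{dw}{w^{\alpha+1}} \right) = O\!\left( \frac{(\varepsilon_0)^{n-k-1}}{\theta^{\alpha} 2^k} \right). $$

We now combine estimates for $I_1$, $I_2$ and the following estimates for the incomplete
beta-function:

$$ B(x; a, b) = O\!\left(\frac{x^a}{a}\right), $$

$$ B(x; a, b) = B(a, b) + O\!\left(\frac{(1-x)^b}{b}\right), $$

$$ \frac{1}{B(k-1, n-k)} = \frac{\Gamma(n-1)}{\Gamma(k-1)\Gamma(n-k)} = O\!\left(\frac{n^{k-1}}{\Gamma(k-1)}\right). $$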
This gives us
$P(k_i = k) =$




k J 20 “a(a + 1 ) - 


B{k — 1, n — fc) + O 


1_^ 

2(a+l) 




fc 


-O 


\^26l“(a+l) J 
k-1 


k-1 


+ o 


i^o) 


n—k— 1 


Qa2k 


B{k-l,n-k) 


k J 20 “a(a + 1 ) 


1 + 0 


(ei) 


•n-k ^k-1 


(n — k)T(k — 1) 


+ O 


/ \2S-(a+l)) 


k-1 


,fc-l' 


o 


(^o) 


„_/c-i ^fc-i 


\ (A:-l)r(fc-l) / \ 6l“2'= r(fc-l) 

Let us introduce the following notations: 


%—k 


/ \n—k k—^ \ 1 

(£l) n'^ \ . . 1 


^ = o 


B = o 


(n — fc)r(fc — 1) 


I, where £i = 1 — 


2(a + 1) 


2e“(o+i) J 


k-1 


,fc-l' 


(fc- i)r(fc- 1) 

(eo) 


-k-1 ^k-1 


C = 0 \ —r- ^ -r I, where £o = ^ ( 1 H ^— 

I 6»“2'= r(fc-l)/ 2 V a + 1 


Using = o(l), for fc(n) < Cqu we get 


/ ,„2a \fe—1 

(a+u) (f) 

■ ■ r(fc) 


B = o 


k-1 ' 


= 0 ( 1 ). 


If fc(n) is a bounded function, then since £o < 1 and £i < 1 we have 


.4 = o((£i)^n'=-i) =o(l), 
O = o((£o)”-"n'=-^) =o(l). 


If k{n) —^ oo as n —>■ oo, using Stirling’s approximation r(fc— 1 ) 
we get 


VMk-2){j^y~' 


A = 0 



k-2 

— k)\/k — 2 



n 


k -2 



c = o 


f Vk^ 

0“ 



n-k-1 

k-1 


k-2J 
Since $\varepsilon^x x \to 0$ for $\varepsilon < 1$ as $x \to \infty$, there exist constants $C_0$ and $N_0$ such that

n — k n — fc — 1 

for n > Nq and k{n) < Cqti we have < 1 and (^o) < 1- This 

implies that A = o(l) and C = o(l). 

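Thus, we obtain

$$ P(k_i = k) = (1 + o(1)) \binom{n-1}{k} \frac{w_0^{2\alpha}}{2\theta^{\alpha}(\alpha+1)}\, B(k-1, n-k) = (1 + o(1))\, \frac{(n-1)\, w_0^{2\alpha}}{2\theta^{\alpha}(\alpha+1)} \cdot \frac{1}{k(k-1)} \propto k^{-2}. \tag{11} $$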


Note that regardless of the shape parameter of the Pareto distribution of weights
we always generate networks with a degree distribution following a power law with
an exponent equal to 2. In the next section we modify our model in order to change
the exponent of the degree distribution and some other properties of the resulting
networks.

6 Model modifications 

In this section we will show how to modify our model to get new properties and 
how these modifications will affect the degree distribution. 

6.1 Directed network 

Many real networks are directed. In order to model them and obtain an exponent
of the power law that differs from 2, we changed the condition for the existence of
an edge. There will be a directed edge $(v_i, v_j)$ if and only if

$$ \langle w_i^{\alpha} x_i,\; w_j^{\beta} x_j \rangle \ge \theta, \quad \alpha, \beta > 0. $$

As follows from the next theorem, this modification allows us to tune the exponent
of the power law.
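
The directed edge condition can be sketched as follows. This is a plain-Python illustration under stated assumptions: the function name `directed_factorization_graph` is hypothetical, directions live on the 2-sphere, and the Pareto shape is passed separately as `shape` so that it can be set equal to the exponent on the source weight or varied independently.

```python
import math
import random


def directed_factorization_graph(n, shape, w0, theta, a, b, seed=0):
    """Directed variant: edge (i, j) exists when w_i**a * w_j**b * <x_i, x_j> >= theta.

    `a` is the exponent on the source weight and `b` the exponent on the
    target weight; `shape` is the Pareto shape (kept separate as an assumption).
    """
    rng = random.Random(seed)
    weights = [w0 * rng.paretovariate(shape) for _ in range(n)]
    directions = []
    for _ in range(n):
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        directions.append([c / norm for c in v])
    edges = set()
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dot = sum(p * q for p, q in zip(directions[i], directions[j]))
            if (weights[i] ** a) * (weights[j] ** b) * dot >= theta:
                edges.add((i, j))
    return weights, edges


weights, edges = directed_factorization_graph(
    n=120, shape=2.0, w0=1.0, theta=3.0, a=2.0, b=0.5, seed=2)
out_degree = [0] * 120
in_degree = [0] * 120
for i, j in edges:
    out_degree[i] += 1
    in_degree[j] += 1
print(max(out_degree), max(in_degree))
```

When the two exponents coincide the condition is symmetric in $i$ and $j$, so every edge appears in both directions and the graph is effectively undirected.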

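Theorem 6.1 Let $P_{out}(k)$ be the probability of a random node to have out-degree
$k$, and $P_{in}(k)$ the probability to have in-degree $k$. If … $= o(1)$, then there exist constants $C_0$
and $N_0$ such that for every $k(n)$ satisfying $k(n) < C_0 n$ for all $n > N_0$ we have

$$ P_{out}(k) = (1 + o(1))\, k^{-1-\alpha/\beta}, \quad P_{in}(k) = (1 + o(1))\, k^{-1-\beta/\alpha}. $$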

Proof Here is a proof for the out-degree distribution. The case of the in-degree
distribution is similar.

First, let us compute $P_e(w)$, the probability that the node $v_i = w_i x_i$ with weight
$w_i = w$ has an edge to another random node:

$$ P_e(w) = \int_{w_0}^{\infty} f(w') \int_{\substack{x' \in S(0,1) \\ \langle w^{\alpha} x,\, (w')^{\beta} x' \rangle \ge \theta}} \frac{1}{4\pi}\, dx'\, dw'. \tag{12} $$

Figure 3: Example in of influence h(x) = x, h(x) = e“, h(x) = x^


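Similarly to Lemma 4.1 we get

$$ P_e(w) = \int_{\max\{w_0,\, (\theta/w^{\alpha})^{1/\beta}\}}^{\infty} \frac{\alpha w_0^{\alpha}}{(w')^{\alpha+1}} \cdot \frac{1}{2}\left(1 - \frac{\theta}{w^{\alpha} (w')^{\beta}}\right) dw'. \tag{13} $$

Thus, we obtain

$$
P_e(w) =
\begin{cases}
\dfrac{1}{2}\left(1 - \dfrac{\alpha\theta}{w^{\alpha}(\alpha+\beta)\, w_0^{\beta}}\right), & w^{\alpha} \ge \theta/w_0^{\beta}, \\[3mm]
\dfrac{w_0^{\alpha}\, w^{\alpha^2/\beta}}{2\theta^{\alpha/\beta}}\left(1 - \dfrac{\alpha}{\beta+\alpha}\right), & w^{\alpha} < \theta/w_0^{\beta}.
\end{cases}
\tag{14}
$$

Like in Theorem 5.1 we have

$$ P(k_i = k) = \binom{n-1}{k} \int_{w_0}^{\infty} (P_e(w))^k (1 - P_e(w))^{n-1-k}\, \frac{\alpha w_0^{\alpha}}{w^{\alpha+1}}\, dw. $$

The rest of the proof is similar to the corresponding steps of Theorem 5.1, so we
omit details here.

□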


With $\alpha = \beta$ this model turns into the undirected case with the power-law exponent
equal to 2, which agrees with Theorem 5.1.

6.2 Functions of dot product 

In our model, because of the condition $w_i w_j \langle x_i, x_j \rangle \ge \theta > 0$, the node $v_i$ can only be
connected to the node $v_j$ if the angle between $x_i$ and $x_j$ is less than $\pi/2$. This is
a constraint on the possible neighbours of a node that restricts the scope of our
model.

We can solve this issue by changing the condition for the existence of an edge:

$$ w_i^{\alpha} w_j^{\beta}\, h(\langle x_i, x_j \rangle) \ge \theta, \tag{15} $$

where $h : [-1, 1] \to \mathbb{R}$. Figure 3 shows an example of how it works in
Theorem 6.2 Let $P_{out}(k)$ be the probability of a random node to have out-degree
$k$, and $P_{in}(k)$ the probability to have in-degree $k$. If … $= o(1)$ and $h : [-1, 1] \to \mathbb{R}$ is a continuous,
strictly increasing function, positive in at least one point of $(-1, 1)$, then there
exist constants $C_0$ and $N_0$ such that for every $k(n)$ satisfying $k(n) < C_0 n$ for all $n > N_0$ we have

$$ P_{out}(k) = k^{-1-\alpha/\beta}\,(1 + o(1)), \quad P_{in}(k) = k^{-1-\beta/\alpha}\,(1 + o(1)). $$

Short scheme of proof Here is the scheme of proof for the out-degree distribution. 
The case of the in-degree is similar. 

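Restrictions on the function $h$ allow us to modify the proof of the directed case.
The main difference is the value of the probability $P_e(w)$ of a node $v_i = w_i x_i$ with
the weight $w_i = w$ to have an edge to another random node:

$$ P_e(w) = \int_{w_0}^{\infty} \frac{\alpha w_0^{\alpha}}{(w')^{\alpha+1}} \int_{\substack{x' \in S^2 \\ w^{\alpha} (w')^{\beta}\, h(\langle x, x' \rangle) \ge \theta}} \frac{1}{4\pi}\, dx'\, dw'. \tag{16} $$

We will denote by $I$ the inner integral taken without the normalizing factor $\frac{1}{4\pi}$:

$$ I = \int_{\substack{x' \in S^2 \\ w^{\alpha} (w')^{\beta}\, h(\langle x, x' \rangle) \ge \theta}} dx'. \tag{17} $$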

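We can rewrite inequality (15) as $h(\langle x, x' \rangle) \ge \frac{\theta}{w^{\alpha} (w')^{\beta}}$ and notice that $\frac{\theta}{w^{\alpha} (w')^{\beta}} \in (0, +\infty)$.
Let us consider $h([-1, 1]) = [r, q]$; on this interval the function $h$ is invertible.
We examine the mutual position of $[r, q]$ and $(0, +\infty)$. The definition of $h$ implies
that $[r, q] \cap (0, +\infty) \ne \emptyset$. This gives us the next two cases.

A) The first case is $[r, q] \subset (0, +\infty)$. If $\frac{\theta}{w^{\alpha} (w')^{\beta}} \in [r, q]$, then we may invert $h$ and
the inner integral $I$ is equal to $2\pi \left(1 - h^{-1}\!\left(\frac{\theta}{w^{\alpha} (w')^{\beta}}\right)\right)$. If $\frac{\theta}{w^{\alpha} (w')^{\beta}} > q$, then
inequality (15) is not satisfied and $I = 0$. If $0 < \frac{\theta}{w^{\alpha} (w')^{\beta}} < r$, then inequality (15)
is satisfied for any pair of $x$ and $x'$, and $I = 4\pi$, the surface area of $S^2$.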

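To deal with $P_e(w)$, we need to compare $w_0$ with the boundaries for each range of
$\frac{\theta}{w^{\alpha} (w')^{\beta}}$.

1) If $w_0 < \frac{\theta^{1/\beta}}{q^{1/\beta} w^{\alpha/\beta}}$, then

$$
P_e(w) = \int_{w_0}^{\frac{\theta^{1/\beta}}{q^{1/\beta} w^{\alpha/\beta}}} 0\, dw'
+ \int_{\frac{\theta^{1/\beta}}{q^{1/\beta} w^{\alpha/\beta}}}^{\frac{\theta^{1/\beta}}{r^{1/\beta} w^{\alpha/\beta}}} \frac{\alpha w_0^{\alpha}}{(w')^{\alpha+1}} \cdot \frac{1}{2}\left[1 - h^{-1}\!\left(\frac{\theta}{w^{\alpha} (w')^{\beta}}\right)\right] dw'
+ \int_{\frac{\theta^{1/\beta}}{r^{1/\beta} w^{\alpha/\beta}}}^{\infty} \frac{4\pi}{4\pi} \cdot \frac{\alpha w_0^{\alpha}}{(w')^{\alpha+1}}\, dw'.
$$

2) If $\frac{\theta^{1/\beta}}{q^{1/\beta} w^{\alpha/\beta}} \le w_0 < \frac{\theta^{1/\beta}}{r^{1/\beta} w^{\alpha/\beta}}$, then

$$
P_e(w) = \int_{w_0}^{\frac{\theta^{1/\beta}}{r^{1/\beta} w^{\alpha/\beta}}} \frac{\alpha w_0^{\alpha}}{(w')^{\alpha+1}} \cdot \frac{1}{2}\left[1 - h^{-1}\!\left(\frac{\theta}{w^{\alpha} (w')^{\beta}}\right)\right] dw'
+ \int_{\frac{\theta^{1/\beta}}{r^{1/\beta} w^{\alpha/\beta}}}^{\infty} \frac{\alpha w_0^{\alpha}}{(w')^{\alpha+1}}\, dw'.
$$

3) The last case is $w_0 \ge \frac{\theta^{1/\beta}}{r^{1/\beta} w^{\alpha/\beta}}$. But $\theta(n)$ grows with $n$, and for big enough $n$ this
inequality will not be satisfied.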

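B) The second case is $[r, q] \not\subset (0, +\infty)$, which implies $r \le 0$. If $\frac{\theta}{w^{\alpha} (w')^{\beta}} \in (0, q]$,
then $I = 2\pi \left(1 - h^{-1}\!\left(\frac{\theta}{w^{\alpha} (w')^{\beta}}\right)\right)$. If $\frac{\theta}{w^{\alpha} (w')^{\beta}} > q$, then $I = 0$. This gives

$$ P_e(w) = \int_{\max\left\{w_0,\; \frac{\theta^{1/\beta}}{q^{1/\beta} w^{\alpha/\beta}}\right\}}^{\infty} \frac{\alpha w_0^{\alpha}}{(w')^{\alpha+1}} \cdot \frac{1}{2}\left[1 - h^{-1}\!\left(\frac{\theta}{w^{\alpha} (w')^{\beta}}\right)\right] dw'. $$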

It remains only to show that $P_{out}(k)$ has the power-law form stated in the theorem. But now it is easy to see
that the influence of every kind of principal part of the integral for $P_e(w)$ has
already been examined in the previous theorems on degree distributions. For example,


J: 


81//3 

aWn 


gi/f3 (w'p+^ 2 
.//3„l//3 ^ ^ 




a^..2aaj0 


WqW 


P8a//3 


W°^{w')P 

J r 


)]dw' = 


which is proportional to the one we got in Theorem 6.1. Therefore, we do not give
additional details here.

□ 


For example, described class of functions contains functions like and + c, 

m G N, for a proper constant c. 

Of course, not only this small class of functions h{x) has no influence on the degree 
distribution. For example, it is easy to show that h{x) = G N also has this 

property. In this way, a proof will be different only in the computation of Pe{w). 


7 Conclusion 

In our work we suggest a new model for scale-free networks generation, which
is based on matrix factorization and has a geographical interpretation. We
formalize it for fixed-size and growing networks. We prove and validate empirically
that the degree distribution of the resulting networks obeys a power law with an exponent of 2.

We also consider several extensions of the model. First, we research the case 
of the directed network and obtain power-law degree distribution with a tunable 
exponent. Then, we apply different functions to the dot product of latent features 
vectors, which give us modifications with interesting properties. 

Further research could focus on a deeper study of the distribution of latent features
vectors. It seems that not only a uniform distribution over the surface of the sphere
should be considered because, for example, cities are not uniformly distributed over
the surface of the Earth. Besides, we want to try other distributions of weights.
















8 Appendix 

8.1 Proof of Lemma 4.1 

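For a node $x$ with the weight $w$, the probability to be connected to a random node
is represented by

$$ P_e(w) = \int_{w_0}^{\infty} f(w') \int_{\substack{x' \in S^2 \\ w w' \langle x, x' \rangle \ge \theta}} \frac{1}{4\pi}\, dx'\, dw'. \tag{18} $$

We can rewrite the inequality $w w' \langle x, x' \rangle \ge \theta$ as $\langle x, x' \rangle \ge \frac{\theta}{w w'}$. If $\frac{\theta}{w w'} \in [0, 1]$, this
inequality defines a spherical cap of area $2\pi \left(1 - \frac{\theta}{w w'}\right)$. Therefore, we have

$$ P_e(w) = \int_{\max\{w_0,\, \theta/w\}}^{\infty} f(w')\, 2\pi \left(1 - \frac{\theta}{w w'}\right) \frac{1}{4\pi}\, dw' \tag{19} $$

$$ = \int_{\max\{w_0,\, \theta/w\}}^{\infty} \frac{\alpha}{w_0} \left(\frac{w_0}{w'}\right)^{\alpha+1} \frac{1}{2} \left(1 - \frac{\theta}{w w'}\right) dw'. \tag{20} $$

If $w < \theta/w_0$, then we obtain

$$
\begin{aligned}
P_e(w) &= \int_{\theta/w}^{\infty} \frac{\alpha w_0^{\alpha}}{2 (w')^{\alpha+1}}\, dw' - \int_{\theta/w}^{\infty} \frac{\alpha w_0^{\alpha}\, \theta}{2 w\, (w')^{\alpha+2}}\, dw' \\
&= \frac{1}{2}\, \frac{w_0^{\alpha}}{(\theta/w)^{\alpha}} - \frac{\alpha w_0^{\alpha}\, \theta}{2 w (\alpha+1) (\theta/w)^{\alpha+1}} \\
&= \frac{w_0^{\alpha}}{2 \theta^{\alpha} (\alpha+1)}\, w^{\alpha}.
\end{aligned}
$$

If $w \ge \theta/w_0$, then

$$
P_e(w) = \int_{w_0}^{\infty} \frac{\alpha w_0^{\alpha}}{2 (w')^{\alpha+1}}\, dw' - \int_{w_0}^{\infty} \frac{\alpha w_0^{\alpha}\, \theta}{2 w\, (w')^{\alpha+2}}\, dw'
= \frac{\alpha w_0^{\alpha}}{2} \cdot \frac{1}{\alpha w_0^{\alpha}} - \frac{\alpha w_0^{\alpha}\, \theta}{2 w} \cdot \frac{1}{(\alpha+1) w_0^{\alpha+1}}
= \frac{1}{2} \left(1 - \frac{\alpha\theta}{w (\alpha+1) w_0}\right).
$$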


8.2 Proof of Lemma 4.2 

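The edge probability is represented by

$$ P_e = \int_{w_0}^{\infty} \int_{S^2} \int_{w_0}^{\infty} \int_{\substack{x' \in S^2 \\ w w' \langle x, x' \rangle \ge \theta}} f(w) f(w')\, \frac{1}{16\pi^2}\, dx'\, dw'\, dx\, dw. \tag{21} $$

Using (18), we obtain

$$ P_e = \int_{w_0}^{\infty} \int_{S^2} \frac{1}{4\pi}\, f(w) P_e(w)\, dx\, dw = \int_{w_0}^{\infty} f(w) P_e(w)\, dw. $$

If $\theta < w_0^2$, then for all $w \in [w_0, \infty)$ the probability $P_e(w)$ equals $\frac{1}{2}\left(1 - \frac{\alpha\theta}{w (\alpha+1) w_0}\right)$. Using it, we
get

$$ P_e = \int_{w_0}^{\infty} \frac{\alpha w_0^{\alpha}}{w^{\alpha+1}} \cdot \frac{1}{2} \left(1 - \frac{\alpha\theta}{w (\alpha+1) w_0}\right) dw = \frac{1}{2} - \frac{1}{2}\, \frac{\alpha^2}{(\alpha+1)^2}\, \frac{\theta}{w_0^2}. $$

If $\theta \ge w_0^2$, then

$$
\begin{aligned}
P_e &= \int_{w_0}^{\theta/w_0} \frac{\alpha w_0^{\alpha}}{w^{\alpha+1}} \cdot \frac{w_0^{\alpha} w^{\alpha}}{2 \theta^{\alpha} (\alpha+1)}\, dw + \int_{\theta/w_0}^{\infty} \frac{\alpha w_0^{\alpha}}{w^{\alpha+1}} \cdot \frac{1}{2} \left(1 - \frac{\alpha\theta}{w (\alpha+1) w_0}\right) dw \\
&= \frac{\alpha w_0^{2\alpha} (\ln\theta - 2\ln w_0)}{2 \theta^{\alpha} (\alpha+1)} + \frac{w_0^{2\alpha}}{2 \theta^{\alpha}} - \frac{\alpha^2 w_0^{2\alpha}}{2 (\alpha+1)^2 \theta^{\alpha}} \\
&= \frac{w_0^{2\alpha}}{2 \theta^{\alpha}} \left( \frac{\alpha (\ln\theta - 2\ln w_0)}{\alpha+1} + \frac{2\alpha+1}{(\alpha+1)^2} \right).
\end{aligned}
$$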

8.3 Proof of Lemma 4.6 

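Let us enumerate pairs of nodes. Each pair of nodes $i$ has an edge indicator $I_{e_i}$.
Let $N = \frac{n(n-1)}{2}$ be the number of pairs. By definition, we have

$$
\begin{aligned}
\mathrm{Var}(M) &= \mathbb{E}(M^2) - (\mathbb{E}M)^2 = \mathbb{E}(I_{e_1} + \dots + I_{e_N})^2 - (\mathbb{E}I_{e_1} + \dots + \mathbb{E}I_{e_N})^2 \\
&= \sum_i \mathbb{E} I_{e_i}^2 + 2 \sum_{i < j} \mathbb{E} I_{e_i} I_{e_j} - \sum_i (\mathbb{E} I_{e_i})^2 - 2 \sum_{i < j} \mathbb{E} I_{e_i}\, \mathbb{E} I_{e_j}.
\end{aligned}
$$

$I_{e_1}, \dots, I_{e_N}$ is a sequence of identically distributed random variables, so
their expected value is the same and equals $P_e$.
Since $\mathbb{E} I_{e_i}^2 = \mathbb{E} I_{e_i} = P_e$, it follows that

$$ \mathrm{Var}(M) = \frac{n(n-1)}{2} P_e (1 - P_e) + 2 \sum_{i < j} \left( \mathbb{E} I_{e_i} I_{e_j} - \mathbb{E} I_{e_i}\, \mathbb{E} I_{e_j} \right). $$

If edges $e_i$ and $e_j$ do not have mutual nodes, then $I_{e_i}$ and $I_{e_j}$ are independent
variables. Therefore, $\mathbb{E}(I_{e_i} I_{e_j}) = \mathbb{E}(I_{e_i})\, \mathbb{E}(I_{e_j}) = P_e^2$, and only pairs of edges sharing a node contribute. We get

$$
\begin{aligned}
\mathrm{Var}(M) &= \frac{n(n-1)}{2} P_e (1 - P_e) + 2 \sum_{v=1}^{n} \sum_{\substack{w=1 \\ w \ne v}}^{n} \sum_{\substack{z=w+1 \\ z \ne v}}^{n} \left( \mathbb{E} I_{e(v,w)} I_{e(v,z)} - \mathbb{E} I_{e(v,w)}\, \mathbb{E} I_{e(v,z)} \right) \\
&= \frac{n(n-1)}{2} P_e (1 - P_e) + n(n-1)(n-2)\,(P_< - P_e^2),
\end{aligned}
$$

because $\mathbb{E} I_{e(v,w)} I_{e(v,z)}$ is exactly equal to $P_<$.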

8.4 Proof of Lemma 4.7 

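It can be easily seen that

$$ P_< = \int_{w_0}^{\infty} f(w)\, (P_e(w))^2\, dw. $$

In the case $\theta \ge w_0^2$ this integral splits into two parts:

$$ P_< = \int_{w_0}^{\theta/w_0} \frac{\alpha w_0^{\alpha}}{w^{\alpha+1}} \left( \frac{w_0^{\alpha} w^{\alpha}}{2 \theta^{\alpha} (\alpha+1)} \right)^2 dw + \int_{\theta/w_0}^{\infty} \frac{\alpha w_0^{\alpha}}{w^{\alpha+1}} \cdot \frac{1}{4} \left(1 - \frac{\alpha\theta}{w (\alpha+1) w_0}\right)^2 dw. $$

Computing the first integral, we get

$$ \int_{w_0}^{\theta/w_0} \frac{\alpha w_0^{3\alpha}\, w^{\alpha-1}}{4 \theta^{2\alpha} (\alpha+1)^2}\, dw = \frac{1}{4}\, \frac{w_0^{3\alpha}}{\theta^{2\alpha} (\alpha+1)^2} \left[ \left(\frac{\theta}{w_0}\right)^{\alpha} - w_0^{\alpha} \right]. $$

And for the second one we have

$$
\begin{aligned}
&\int_{\theta/w_0}^{\infty} \frac{1}{4} \left(1 - \frac{\alpha\theta}{w (\alpha+1) w_0}\right)^2 \frac{\alpha w_0^{\alpha}}{w^{\alpha+1}}\, dw \\
&\quad = \frac{1}{4}\, \alpha w_0^{\alpha} \int_{\theta/w_0}^{\infty} \frac{dw}{w^{\alpha+1}} - \frac{1}{2}\, \frac{\alpha^2 \theta\, w_0^{\alpha-1}}{\alpha+1} \int_{\theta/w_0}^{\infty} \frac{dw}{w^{\alpha+2}} + \frac{1}{4}\, \frac{\alpha^3 \theta^2 w_0^{\alpha-2}}{(\alpha+1)^2} \int_{\theta/w_0}^{\infty} \frac{dw}{w^{\alpha+3}} \\
&\quad = \frac{1}{4}\, \frac{w_0^{2\alpha}}{\theta^{\alpha}} - \frac{1}{2}\, \frac{\alpha^2 \theta\, w_0^{\alpha-1}}{(\alpha+1)^2}\, \frac{w_0^{\alpha+1}}{\theta^{\alpha+1}} + \frac{1}{4}\, \frac{\alpha^3 \theta^2 w_0^{\alpha-2}}{(\alpha+1)^2 (\alpha+2)}\, \frac{w_0^{\alpha+2}}{\theta^{\alpha+2}} \\
&\quad = \frac{1}{4}\, \frac{w_0^{2\alpha}}{\theta^{\alpha}} - \frac{1}{2}\, \frac{\alpha^2}{(\alpha+1)^2}\, \frac{w_0^{2\alpha}}{\theta^{\alpha}} + \frac{1}{4}\, \frac{\alpha^3}{(\alpha+1)^2 (\alpha+2)}\, \frac{w_0^{2\alpha}}{\theta^{\alpha}}.
\end{aligned}
$$

This gives us $P_<$ in the case of $\theta \ge w_0^2$:

$$ P_< = \frac{1}{4}\, \frac{w_0^{3\alpha}}{\theta^{2\alpha} (\alpha+1)^2} \left[ \left(\frac{\theta}{w_0}\right)^{\alpha} - w_0^{\alpha} \right] + \frac{1}{4}\, \frac{w_0^{2\alpha}}{\theta^{\alpha}} \left[ 1 - \frac{2\alpha^2}{(\alpha+1)^2} + \frac{\alpha^3}{(\alpha+1)^2 (\alpha+2)} \right]. $$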